Learn how TriAttention, a new attention method, compresses memory in large language models to make them 2.5x faster without losing accuracy.
Learn to build a hybrid neural network architecture, similar to Liquid AI's LFM2-24B-A2B model, that combines attention mechanisms with convolutional layers to address scaling bottlenecks in large language models.
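The hybrid pattern that the second article describes can be illustrated with a minimal sketch: interleave cheap gated convolution blocks with occasional self-attention blocks. Everything below is an illustrative assumption, not Liquid AI's actual LFM2-24B-A2B design; the layer choices, the gating scheme, the dimensions, and the `attn_every` schedule are placeholders.

```python
# A minimal sketch of a hybrid attention + convolution model in PyTorch.
# Illustrative only: layer choices, gating, and dimensions are assumptions,
# not Liquid AI's actual LFM2-24B-A2B architecture.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Gated short-range depthwise convolution over the sequence dimension."""

    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        # Depthwise conv, left-padded so position t only sees positions <= t.
        self.conv = nn.Conv1d(dim, dim, kernel_size, groups=dim,
                              padding=kernel_size - 1)
        self.gate = nn.Linear(dim, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        h = self.norm(x)
        # Conv1d expects (batch, dim, seq_len); trim the right-side padding
        # back to seq_len so the conv stays causal.
        c = self.conv(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return x + self.proj(c * torch.sigmoid(self.gate(h)))  # gated residual


class AttnBlock(nn.Module):
    """Standard multi-head self-attention with a residual connection.

    A causal mask is omitted for brevity; a language model would pass one.
    """

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out


class HybridModel(nn.Module):
    """Interleaves cheap conv blocks with a few attention blocks."""

    def __init__(self, dim: int = 64, depth: int = 6, attn_every: int = 3):
        super().__init__()
        self.layers = nn.ModuleList(
            AttnBlock(dim) if (i + 1) % attn_every == 0 else ConvBlock(dim)
            for i in range(depth)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x


if __name__ == "__main__":
    model = HybridModel()
    tokens = torch.randn(2, 16, 64)  # (batch, seq_len, dim)
    print(model(tokens).shape)       # torch.Size([2, 16, 64])
```

The design intuition behind the interleaving: convolutions mix nearby tokens at a cost linear in sequence length, so the quadratic cost of attention is only paid in a fraction of the layers, which is one way such hybrids address the scaling bottleneck the article mentions.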